Speech
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
Size:
None
Production Status:
Newly created-in progress
Use:
Speech Recognition/Understanding
- Paper title: UncommonVoice: A Crowdsourced Dataset of Dysphonic Speech
- Paper track: 1.11 Speech and voice disorders/Poster Presentation
- Paper status: Accept - Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Meredith Moore | UncommonVoice | /N |
Documentation:
None
Speech
Corpus,
Language Type:
Bilingual
Languages:
English Yue Chinese
Availability:
Public release planned for later this year
License:
Size:
27 GByte
Production Status:
Newly created-in progress
Use:
Linguistic Research
- Paper title: Bilingual acoustic voice variation is similarly structured across languages
- Paper track: 1.8 Code switching and multilingual studies/Oral Presentation
- Paper status: Accept - Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Khia A. Johnson | SpiCE corpus of conversational bilingual Speech in Cantonese and English | /N |
Documentation:
https://spice-corpus.readthedocs.io/
Speech
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
From Data Center(s)
License:
LDC
Size:
60 hours
Production Status:
Existing-used
Use:
Speech Recognition/Understanding
- Paper title: UNSW System Description for the Shared Task on Automatic Speech Recognition for Non-Native Children’s Speech
- Paper track: 13.2 Automatic Speech Recognition for Non-Native C/Oral Presentation
- Paper status: Accept Special Session
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Mostafa Shahin | OGI Kids speech corpus | /N |
Documentation:
None
Speech/Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
Size:
1 GByte
Production Status:
Existing-used
Use:
Speech Synthesis
- Paper title: MultiSpeech: Multi-Speaker Text to Speech with Transformer
- Paper track: 7.5 Towards end-to-end speech synthesis/Oral Presentation
- Paper status: Accept - Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Xu Tan | VCTK | /N |
Documentation:
None
Speech/Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
Size:
1 GByte
Production Status:
Existing-used
Use:
Speech Synthesis
- Paper title: MultiSpeech: Multi-Speaker Text to Speech with Transformer
- Paper track: 7.5 Towards end-to-end speech synthesis/Oral Presentation
- Paper status: Accept - Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Xu Tan | LibriTTS | /N |
Documentation:
None
Speech
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
Creative Commons
Size:
1.91 GByte
Production Status:
Existing-used
Use:
Speech Recognition/Understanding
- Paper title: Small-Footprint Keyword Spotting with Multi-Scale Temporal Convolution
- Paper track: 5.5 Speech and audio classification/Poster Presentation
- Paper status: Accept - Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Ximin Li | Google’s Speech Commands Dataset | /N |
Documentation:
Yes, English, Yes
Speech
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
From Owner
License:
Free within participation in Interspeech 2020 ComParE
Size:
49 x 5 minutes
Production Status:
Newly created-finished
Use:
Evaluation/Validation
- Paper title: The INTERSPEECH 2020 Computational Paralinguistics Challenge: Elderly Emotion, Breathing & Masks
- Paper track: 13.15 The INTERSPEECH 2020 Computational Paralingu/Oral Presentation
- Paper status: Accept Special Session
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Björn Schuller | UCL Speech Breath Monitoring (UCL-SBM) | /N |
Documentation:
None
Speech, noise and room impulse response data,
Language Type:
Multilingual
Languages:
English German Spanish
Availability:
via request, maybe public in the future?
License:
Creative Commons Attribution-NonCommercial 4.0 International Public License
Size:
2.3 GByte Other
Production Status:
released
Use:
Development of speech-enhancement algorithms
- Paper title: Optimization and evaluation of an intelligibility-improving signal processing approach (IISPA) for the Hurricane Challenge 2.0 with FADE
- Paper track: 13.4 Intelligibility-enhancing Speech Modification/Oral Presentation
- Paper status: Accept Special Session
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Marc René Schädler | Hurricane Challenge 2.0 development set | /N |
Documentation:
Yes, English
pos-tagging
Bilingual corpora from Europarl (Koehn, 2005),
Language Type:
Multilingual
Languages:
English French German
Availability:
License:
Size:
2M tokens
Production Status:
Use:
Machine Translation, contrastive analysis
- Paper title: The Learnability of the Annotated Input in NMT Replicating (Vanmassenhove and Way, 2018) with OpenNMT
- Paper track: Evaluation/poster presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Nicolas Ballier | Annotated Europarl | /N |
Documentation:
Koehn 2005 paper
,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
non-commercial
Size:
1224 exam scripts Other
Production Status:
Existing-used
Use:
Learner Language Analysis
- Paper title: Reproduction and Replication: A Case Study with Automatic Essay Scoring
- Paper track: Evaluation/poster presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Eva Huber | First Certificate in English (FCE) exams of Cambridge Learner Corpus (CLC) | /N |
Documentation:
None




